Addition of parameter definitions in MESSAGEix documentation #275

Merged
merged 7 commits into from
Dec 19, 2019


GamzeUnlu95
Contributor

@GamzeUnlu95 GamzeUnlu95 commented Nov 27, 2019

Closes #260.
Two other issues that are related and might also be addressed in this PR: #241 and #51.

This PR updates the parameter documentation and adds explanations for the parameters that are not self-explanatory. The explanations are based on the definitions provided in the parameter_def.gms file.

Although the initial idea was to add a definition to the tables for every parameter, after a discussion with @volker-krey we agreed that it is better to add definitions only for parameters that are unclear. Most of the parameters are already self-explanatory (e.g. bound_activity_up, construction_time), and the tables become overly long (especially in the PDF version of the documentation) when all the definitions are added.

Additional information and related links are added to the footnotes and to the explanation paragraphs for the parameters and sections that might require more explanation (e.g. rating_bin, add-on technologies, cost parameters for ‘soft’ relaxations of dynamic constraints, auxiliary investment cost parameters and multipliers, emission scaling, relations, resources).

These additions are based on the parameter_def.gms file, which includes only basic definitions of the parameters. Reviewers are welcome to suggest changes to the explanations where they are not clear enough (especially for the section "Auxiliary investment cost parameters and multipliers").

The documentation page can be built by using the command: sphinx-build [source-directory] [output-directory]

source-directory: C:\....\message_ix\doc\source
output-directory: C:\....\message_ix\doc\build
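
For anyone who prefers to script the build, here is a minimal sketch that assembles the same sphinx-build command from the root of a checkout. The helper name sphinx_build_command and the "message_ix" root used in the example are assumptions for illustration; only the doc/source and doc/build layout comes from the paths above.

```python
from pathlib import Path


def sphinx_build_command(repo_root):
    """Return the sphinx-build argument list for a message_ix checkout.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    root = Path(repo_root)
    source_dir = root / "doc" / "source"  # source-directory
    output_dir = root / "doc" / "build"   # output-directory
    return ["sphinx-build", str(source_dir), str(output_dir)]


if __name__ == "__main__":
    # Print the command so it can be copied into a terminal.
    print(" ".join(sphinx_build_command("message_ix")))
```

The returned list can be passed to subprocess.run() if the build should be started from Python.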

@khaeru
Member

khaeru commented Nov 28, 2019

@GamzeUnlu95 thanks for this. I have turned on the automatic RTD builds of this branch, and they seem to complete successfully! Here is the page you modified: https://message.iiasa.ac.at/en/parameter_doc_new/model/MESSAGE/parameter_def.html

Please solicit others' input, in the form of comments on this PR, as to which (Option 1 or Option 2) is the clearest form for the documentation.

@khaeru
Member

khaeru commented Nov 28, 2019

Also note that, per our in-person discussion, message_ix tests are currently failing because iiasa/ixmp#225 needs to be completed, reviewed, and merged.

@khaeru
Member

khaeru commented Dec 9, 2019

@GamzeUnlu95 —commit 800c35a seems to revert the addition of descriptions and the Option 2 table that were added by bb8a35b. Was that intentional?

@GamzeUnlu95
Contributor Author

@khaeru, that was intentional. After converting to PDF and a discussion with Volker, we decided not to include all the definitions in the tables, but to include them in the footnotes or in the paragraphs where they are relevant. I also updated the PR description.

@francescolovat
Contributor

Hi @GamzeUnlu95,
thanks for your work.

I've edited the PR description to state that it closes issue #260. You used the closes keyword the other way around: you wrote closes #275 in the issue itself. The keyword closes issues from PRs, so I also deleted that comment in #260.

Now I'm going through what you've written. I'll point out some very minor details and also push a commit correcting a typo, to see whether all the CI checks pass now that iiasa/ixmp#225 has been merged.

Comment on lines 215 to 216
* .. [#renewables] ``renewable_capacity_factor`` refers to the quality of renewable potential by grade and ``renewable_potential`` refers to the size of the renewable potential per grade.
*

I don't understand the phrase "refers to the quality of renewable potential by grade" here. Do you think it is enough? I may have missed earlier descriptions that would help me understand it.

@codecov

codecov bot commented Dec 12, 2019

Codecov Report

Merging #275 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #275   +/-   ##
=======================================
  Coverage   78.64%   78.64%           
=======================================
  Files          13       13           
  Lines        1138     1138           
=======================================
  Hits          895      895           
  Misses        243      243

Last update 40ad4b5...8ea28c2.

@francescolovat
Contributor

francescolovat commented Dec 12, 2019

While looking for other incorrect parameter definitions on the Parameter definition page, as raised in #51, I now have a question:

  • What is the difference between node_loc and node as used in the tables on the Parameter definition page? node_loc is not even defined on the Sets definition page, so I don't really know why it is used to describe parameters or what differentiates it from node.

I thought the aim was probably to differentiate the sets location and node from the model_core.gms file. But, as you can see in the snapshot below, both have been used interchangeably to describe parameters that have node and location as dimensions in model_core.gms.

Can anyone help me interpret them so I can properly edit their dimensions and eventually close #51? @OFR-IIASA @volker-krey @khaeru @behnam-zakeri

[image: snapshot referenced above, not preserved]

@khaeru
Member

khaeru commented Dec 13, 2019

@francescolovat —that's a great question. Let's also CC @danielhuppmann @gidden per #254.

There are at least three things going on here:

  1. In ixmp (Python/Java code), every dimension of a parameter has both an index set and an optional index/dimension name.
    • See e.g. the init_par() docs.
    • The sets must be actual sets in the Scenario.
    • The names can be anything; e.g. you can have a dimension indexed by year but named 'foo'.
  2. In the MESSAGE GAMS code, there are a number of aliases for sets.
    • See sets_maps_def.gms.
    • For instance, the lines:
      * definition of aliases
      Alias(node,location);
      Alias(node,node2);
      Alias(node,node_share);
      show that e.g. node_share is always an alias of node.
    • This is needed because a GAMS parameter that has the same index set for two of its dimensions can't re-use the same name, e.g. you can't define a GAMS parameter foo(year, year);.
  3. See Disentangle message_ix and ixmp_source #254. While the GAMS definition of MESSAGE's structure is in the .gms files, there is (must be) a parallel definition of the same structure for the Python code to use. At the moment, this is (wrongly, IMO) hidden in the non-open ixmp_source Java package.
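
The distinction between an index set and a dimension name in points 1 and 2 can be sketched in a few lines of plain Python. This is only an illustration, not the ixmp API: the Dimension class, the make_dimensions helper, and names such as year_vtg and year_act are hypothetical, and the idx_sets/idx_names arguments merely mirror the idea described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str       # label of the dimension; can be anything
    index_set: str  # set whose elements are valid entries for this dimension


def make_dimensions(idx_sets, idx_names=None):
    """Pair each index set with a dimension name (hypothetical helper).

    Without explicit names, each dimension is named after its index set.
    Names must be unique, which is the same reason GAMS needs aliases
    for a parameter such as foo(year, year).
    """
    names = list(idx_names) if idx_names is not None else list(idx_sets)
    if len(names) != len(idx_sets):
        raise ValueError("need exactly one name per index set")
    if len(set(names)) != len(names):
        raise ValueError(f"dimension names must be unique, got {names}")
    return [Dimension(n, s) for n, s in zip(names, idx_sets)]


# Two dimensions share the index set 'year' but carry different names.
dims = make_dimensions(
    ["node", "year", "year"],
    ["node_loc", "year_vtg", "year_act"],
)
for dim in dims:
    print(f"{dim.name} is indexed by {dim.index_set}")
```

Calling make_dimensions(["year", "year"]) without names raises a ValueError in this sketch, because both dimensions would end up with the same name.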

So what you've uncovered (I hadn't noticed myself!) is:

  • There is not necessarily a correspondence between a dimension name on the Python side and any alias on the GAMS side.
    • E.g. node_loc is a dimension name, but there is no GAMS alias.
    • E.g. land_emission is defined over the GAMS alias location, but on the Python side it does not have a dimension name.
  • Some index names, like node_loc, are hidden in the Java code and thus the correspondence between them and particular sets can only be guessed at.

I hope this helps to understand the situation and to propose improvements.

@francescolovat
Contributor

francescolovat commented Dec 13, 2019

Yes, thanks a lot @khaeru, I didn't know that aliases were needed in GAMS. I'll go through the links you suggested to understand the whole thing better. Btw, I've just realized that the last commit, a521779, which was meant to help solve issue #51, makes no sense and is wrong. I'll revert it now.

@danielhuppmann
Member

To quote Douglas Adams: "I apologize for the inconvenience".

There are indeed many things going on here, and it is important to keep two things separate:

  1. data handling - think of a pandas.DataFrame representing parameter data
    Here we need to have:
    • consistency checking, so that an entry in the node-column (e.g., where a technology is located) exists in the set node. This is implemented via index_set (and can be overridden by setting index_set='*').
    • column naming: each column of a dataframe has to have a unique name so that we can easily work with it. The column names should be a trade-off between intelligibility and consistency (for example using node_loc for the column that specifies a location for all parameters related to a technology).
  2. mathematical formulation
    The equations of the optimisation problem are quite elaborate to make life easier for a user. For example, the emissions bound of a node automatically considers the emissions of all sub-nodes. Otherwise, a user would have to specify separate bounds for the sub-nodes (determining the allocation of the constraint between sub-nodes a priori), or make some other elaborate parameter specifications.
    This is implemented in GAMS using multiple nested loops, and this requires aliases to correctly implement the summations or if-else-statements. The primary goal of the alias naming is to make it easy for users to understand an equation.

These two use cases are, in principle, not related.
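
To make the separation concrete, here is a small sketch of both concerns side by side. It is neither the ixmp implementation nor the MESSAGE GAMS formulation: the functions check_column and total_emissions, the node hierarchy, and the numbers are all invented for illustration.

```python
def check_column(values, index_set, sets):
    """Data handling: every entry must be an element of its index set.

    Passing '*' as the index set disables the check (hypothetical helper).
    """
    if index_set == "*":
        return
    unknown = sorted(set(values) - set(sets[index_set]))
    if unknown:
        raise ValueError(f"entries {unknown} are not in set {index_set!r}")


def total_emissions(node, emissions, children):
    """Formulation: emissions of a node plus those of all of its sub-nodes."""
    total = emissions.get(node, 0.0)
    for child in children.get(node, ()):
        total += total_emissions(child, emissions, children)
    return total


# Invented example data.
sets = {"node": {"World", "Europe", "Austria", "Germany"}}
children = {"World": ["Europe"], "Europe": ["Austria", "Germany"]}
emissions = {"Austria": 2.0, "Germany": 5.0}

check_column(["Austria", "Germany"], "node", sets)  # valid entries, no error
check_column(["Atlantis"], "*", sets)               # check disabled, no error

# A bound on 'Europe' is compared against the sum over its sub-nodes.
bound_europe = 10.0
print(total_emissions("Europe", emissions, children) <= bound_europe)
```

Neither function needs to know anything about the other: the column check uses only set membership, while the aggregation uses only the node hierarchy.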

@GamzeUnlu95 GamzeUnlu95 merged commit 357fb37 into master Dec 19, 2019
@khaeru khaeru deleted the parameter_doc_new branch August 17, 2020 18:19
Successfully merging this pull request may close these issues.

documentation of parameters only lists dimensions, but lacks description
5 participants